Cognitive Modeling of Semantic Fluency Using Transformers

arXiv.org Artificial Intelligence

Two of the most important ideas underpinning contemporary cognitive science (and the closely related AI subfield of computational cognitive modeling) are the suppositions that the human mind uses cognitive structures and that progress in understanding the mind can come from modeling those structures and the algorithms that operate on them. The semantic fluency task (SFT), sometimes called the verbal fluency task (Welsh et al. [1991]), is commonly employed in service of those goals. In SFT, participants name as many items belonging to a particular semantic category (animals, fruits, etc.) as they can in a fixed amount of time (typically 40-180 seconds). Despite the task's simplicity, the lists generated by participants (which we call semantic fluency lists, or SFLs) offer insights into the structure of human knowledge and the heuristics used for memory retrieval. For example, words sharing semantic features tend to group in clusters, and there is often a temporal delay before a participant switches from one cluster to another. Multiple approaches to computationally modeling behaviors in SFT have been proposed (Hills et al. [2012], Abbott et al. [2015], Zemla et al. [2016], Zemla and Austerweil [2017], Avery and Jones [2018]), most relying on graph-based representations in which words are represented as nodes and edges correspond to some meaningful semantic relationship between the nodes. However, to date, no work has explored whether transformer-based language models (TLMs) can do any better at modeling the generation of SFLs.
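As a rough illustration of the graph-based representations described above, the following minimal sketch stores words as nodes and semantic relationships as edges, then produces a list by walking the graph at random and recording each word the first time it is reached. The toy graph, the function name `random_walk_fluency_list`, and its parameters are hypothetical and chosen only for illustration; this is not a reimplementation of any of the cited models.

```python
import random

# Hypothetical toy semantic network: nodes are words, and each
# undirected edge links two semantically related words.
TOY_GRAPH = {
    "dog": ["cat", "wolf"],
    "cat": ["dog", "lion"],
    "wolf": ["dog", "lion"],
    "lion": ["cat", "wolf", "zebra"],
    "zebra": ["lion", "giraffe"],
    "giraffe": ["zebra"],
}


def random_walk_fluency_list(graph, start, max_steps, seed=0):
    """Walk the graph at random, recording each word on its first visit."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    current = start
    produced = [start]
    seen = {start}
    for _ in range(max_steps):
        # Move to a uniformly chosen neighbor of the current word.
        current = rng.choice(graph[current])
        # Repeat visits are not reported, mirroring the fact that
        # participants do not usually name the same item twice.
        if current not in seen:
            seen.add(current)
            produced.append(current)
    return produced


sfl = random_walk_fluency_list(TOY_GRAPH, "dog", max_steps=50)
print(sfl)
```

Because the walk can only move along edges, words that are densely connected to one another tend to be produced close together, which is one way a graph representation could give rise to the clustering seen in human lists.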
There are multiple reasons, at least from an exploratory perspective, to suspect that TLMs might model SFL generation well. For example: (1) a large body of literature demonstrates why semantic memory cannot be sufficiently represented purely by fixed associative links between lexical nodes: at minimum, representations must allow for dynamic role binding, hierarchical (or otherwise unidirectional) activations, and enough richness to carry out structure-sensitive similarity assessments (Holyoak and Hummel [2000], Sun [2002]); (2) TLMs perform unexpectedly well on human-oriented linguistic benchmarks (Wang et al. [2019]), and they are typically pre-trained using a lengthy process designed to embed deep semantic knowledge, resulting in a dense encoding of semantic relationships (Cui et al. [2020]); (3) the pre-training process often proceeds by optimizing LMs to perform well on the masked language modeling (MLM) task, which bears more than a passing resemblance to the kind of word prediction that some